[CI/Build][Intel] Enable performance benchmarks for Intel Gaudi 3#26919
jikunshang merged 11 commits into vllm-project:main
Code Review
This pull request enables performance benchmarks for Intel Gaudi 3 by adding a new Dockerfile, updating the benchmark script to detect Gaudi devices, and including new test configurations. The changes are well-structured and mostly look good. I've found one potential bug in the benchmark script where a command to check memory usage might fail due to incorrect parsing of the command output. This could prevent the script from correctly waiting for resources to be freed. My review includes a specific suggestion to fix this issue.
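The parsing concern raised in the review can be illustrated with a minimal sketch. It is not the code from the PR's shell script: the function names, the injected `read_output` callable, and the assumed output format (one `<number> MiB` line per device) are all hypothetical. The point it shows is that a reading which cannot be parsed should count as "not yet free", so the wait loop never proceeds on bad input.

```python
import re
import time
from typing import Callable, List

# Assumed (hypothetical) output format: one line per device, e.g. "1536 MiB".
_MIB_LINE = re.compile(r"^\s*(\d+)\s*MiB\s*$")


def parse_used_mib(output: str) -> List[int]:
    """Return the used-memory value (MiB) from each well-formed line."""
    values = []
    for line in output.splitlines():
        match = _MIB_LINE.match(line)
        if match:  # skip headers, blank lines, and anything unexpected
            values.append(int(match.group(1)))
    return values


def wait_until_free(
    read_output: Callable[[], str],
    threshold_mib: int,
    poll_seconds: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until every device reports usage below threshold_mib.

    A reading with no parseable values counts as "not free", so the loop
    keeps polling and returns False once max_polls is exhausted.
    """
    for _ in range(max_polls):
        used = parse_used_mib(read_output())
        if used and all(value < threshold_mib for value in used):
            return True
        sleep(poll_seconds)
    return False
```

Passing the reader and the sleep function as parameters keeps the sketch independent of any particular device tool and makes the loop easy to exercise without hardware.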
💡 Codex Review
https://github.com/vllm-project/vllm/blob/41c2a72218df813e7725c1fde000b50324e4b57b/docker/Dockerfile.hpu#L44-L56
https://github.com/vllm-project/vllm/blob/41c2a72218df813e7725c1fde000b50324e4b57b/.buildkite/nightly-benchmarks/scripts/run-performance-benchmarks.sh#L140-L151
In the new HL‑SMI branch of …
Signed-off-by: jakub-sochacki <jakub.sochacki@wp.pl>
@DarkLight1337 @khluu @jikunshang, could you please help review and merge this? There are also two other PRs related to this one.
…lm-project#26919) Signed-off-by: jakub-sochacki <jakub.sochacki@wp.pl>
Purpose
Enable the Intel Gaudi 3 accelerator in the vLLM benchmark suite for performance benchmarking.
Test Plan
Models tested: Llama 3.1-8B (TP1), Llama 3.1-70B (TP4), Mixtral 8x7B (TP2)
Scenarios: throughput, latency and serving
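The test plan above can be written out as an explicit matrix. The sketch below is illustrative only: it assumes every listed model was run in every listed scenario, which the description does not state. The dictionary keys are hypothetical names and not the schema of the benchmark suite's configuration files.

```python
from itertools import product

# Model names and tensor-parallel (TP) sizes as listed in the test plan.
MODELS = [
    ("Llama 3.1-8B", 1),
    ("Llama 3.1-70B", 4),
    ("Mixtral 8x7B", 2),
]
SCENARIOS = ["throughput", "latency", "serving"]


def build_matrix(models=MODELS, scenarios=SCENARIOS):
    """Pair every model with every scenario (an assumption, not confirmed)."""
    return [
        # Key names are illustrative, not the suite's actual config schema.
        {"model": name, "tensor_parallel_size": tp, "scenario": scenario}
        for (name, tp), scenario in product(models, scenarios)
    ]


if __name__ == "__main__":
    for entry in build_matrix():
        print(entry)
```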
Test Result